Risk-aware multi-armed bandit problem with application to portfolio selection
Similar resources
Risk-aware multi-armed bandit problem with application to portfolio selection
Sequential portfolio selection has attracted increasing interest in the machine learning and quantitative finance communities in recent years. As a mathematical framework for reinforcement learning policies, the stochastic multi-armed bandit problem addresses the primary difficulty in sequential decision-making under uncertainty, namely the exploration versus exploitation dilemma, and therefore...
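The exploration versus exploitation dilemma mentioned above can be made concrete with a small simulation. The sketch below uses the standard UCB1 index policy on Bernoulli arms. It is a generic, risk-neutral illustration and not the risk-aware method of the paper; the function name `ucb1`, the arm means, the horizon, and the seed are all hypothetical choices made for this example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms; return pulls per arm.

    arm_means are hypothetical success probabilities used only to
    simulate rewards; the policy itself never reads them directly.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    pulls = [0] * n_arms      # number of times each arm was played
    totals = [0.0] * n_arms   # summed reward collected from each arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Play every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Empirical mean (exploitation) plus a confidence bonus
            # that shrinks as an arm is played more (exploration).
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls


pulls = ucb1([0.1, 0.5, 0.9], horizon=2000)
print(pulls)
```

In this setup the arm with the highest simulated mean should end up with most of the pulls, while the weaker arms are still sampled occasionally because their confidence bonus grows with the logarithm of the round number.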
Digital Forensics Tool Selection with Multi-armed Bandit Problem
Digital forensics investigation is generally a long and tedious process for an investigator. There are many tools that investigators must consider, both proprietary and open source. Forensics investigators must choose the best tool available on the market for their cases, to make sure they do not overlook any evidence that resides on the suspect device, within a reasonable time frame. This is however har...
The multi-armed bandit problem with covariates
We consider a multi-armed bandit problem in a setting where each arm produces a noisy reward realization which depends on an observable random covariate. As opposed to the traditional static multi-armed bandit problem, this setting allows for dynamically changing rewards that better describe applications where side information is available. We adopt a nonparametric model where the expected rewa...
Multi-armed bandit problem with precedence relations
Abstract: Consider a multi-phase project management problem where the decision maker needs to deal with two issues: (a) how to allocate resources to projects within each phase, and (b) when to enter the next phase, so that the total expected reward is as large as possible. We formulate the problem as a multi-armed bandit problem with precedence relations. In Chan, Fuh and Hu (2005), a class of ...
Multi-armed bandit problem with known trend
We consider a variant of the multi-armed bandit model, which we call the multi-armed bandit problem with known trend, where the gambler knows the shape of the reward function of each arm but not its distribution. This new problem is motivated by different online problems like active learning, music and interface recommendation applications, where, when an arm is sampled by the model, the received re...
Journal
Journal title: Royal Society Open Science
Year: 2017
ISSN: 2054-5703
DOI: 10.1098/rsos.171377